
A Streaming, Distributed Out-of-Core SVD Library on Hybrid CPU/GPU Architectures Applied to Fluid Mechanics Data
Singular Value Decomposition (SVD) plays a crucial role in many scientific and engineering applications, particularly in numerical simulation and model reduction. It is widely used to extract dominant modes from high-dimensional data, enabling dimensionality reduction and simplified descriptions of complex systems in fluid mechanics, structural analysis, and other fields. As datasets from large-scale simulations continue to grow, traditional SVD algorithms running on a single processor can no longer handle them because of memory and computational limits. To address this, we propose an efficient, distributed, out-of-core implementation of truncated Singular Value Decomposition (t-SVD) designed for heterogeneous high-performance computing (HPC) systems. Our solution builds on the Levy-Lindenbaum method, which processes the data as a stream of batches. This approach reduces both the computational cost and the memory footprint of the SVD, allowing for more scalable data decomposition. Our implementation uses CUDA to accelerate the underlying linear algebra for both the SVD and QR decompositions on GPUs. For the QR decomposition in particular, we employ the direct map-reduce computation method on GPU resources to further improve performance. We use MPI for communication between processes on different nodes of a distributed computing environment, handling data exchange and coordination. The new implementation is capable of processing extremely large datasets, potentially up to hundreds of terabytes in size, while maintaining performance. It allows key modes to be extracted from highly complex datasets, making it suitable for a wide range of applications.
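The batching idea can be illustrated with a minimal single-process sketch. It is not the implementation described in this abstract: it uses NumPy on the CPU in place of CUDA and MPI, and the function names (`direct_tsqr`, `streaming_update`, `streaming_tsvd`) and the block count are hypothetical. It assumes the commonly described form of the Levy-Lindenbaum update, in which each new batch of columns is appended to the current scaled basis, a QR factorization is taken, and the SVD of the small R factor is truncated to k modes. The QR step is written in the direct map-reduce (TSQR) style, with the row blocks standing in for the data held by separate processes or GPUs.

```python
import numpy as np


def direct_tsqr(A, n_blocks=4):
    """Direct TSQR of a tall-skinny matrix A; returns (Q, R) with A = Q @ R.

    Each row block plays the role of the data owned by one process or GPU.
    """
    blocks = np.array_split(A, n_blocks, axis=0)
    # Map step: independent reduced QR of every row block.
    local = [np.linalg.qr(blk) for blk in blocks]
    # Reduce step: one QR of the stacked, small R factors.
    R_stack = np.vstack([R_i for _, R_i in local])
    Q2, R = np.linalg.qr(R_stack)
    # Second map step: combine each local Q_i with its slice of Q2.
    Q_parts, row = [], 0
    for Q_i, R_i in local:
        Q_parts.append(Q_i @ Q2[row:row + R_i.shape[0]])
        row += R_i.shape[0]
    return np.vstack(Q_parts), R


def streaming_update(U, S, batch, k):
    """One streaming step: fold a new batch of columns into (U, S)."""
    # Append the new columns to the current basis scaled by its singular values.
    M = np.hstack([U * S, batch])
    Q, R = direct_tsqr(M)
    # The SVD is taken only on the small R factor, never on the full data.
    Ur, Sr, _ = np.linalg.svd(R, full_matrices=False)
    # Truncate to the k leading modes.
    return Q @ Ur[:, :k], Sr[:k]


def streaming_tsvd(batches, k):
    """Truncated SVD (left singular vectors and values) of column batches."""
    U = S = None
    for batch in batches:
        if U is None:
            # Start from an empty basis with the right number of rows.
            U = np.zeros((batch.shape[0], 0))
            S = np.zeros(0)
        U, S = streaming_update(U, S, batch, k)
    return U, S


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Hypothetical snapshot matrix: 200 spatial points, 40 snapshots, rank 3.
    data = rng.standard_normal((200, 3)) @ rng.standard_normal((3, 40))
    # Only one batch of 10 snapshots is processed at a time.
    U, S = streaming_tsvd(np.array_split(data, 4, axis=1), k=3)
    print("leading singular values:", S)
```

In this sketch only the current batch and the m-by-k basis are held in memory at once, which is what makes an out-of-core variant possible. When the data have rank at most k the truncation discards nothing; otherwise the result is an approximation of the leading modes.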
One key application of our approach is in fluid mechanics, specifically in Direct Numerical Simulation (DNS), which is used to capture detailed turbulent flow dynamics. DNS produces datasets large enough to overwhelm traditional SVD methods. By applying our distributed out-of-core SVD, we efficiently decompose these complex datasets, such as those from Rayleigh-Taylor Instability (RTI) simulations, and extract their dominant flow structures and modes. This enhances understanding of turbulence and instabilities in fluid dynamics and addresses the increasing computational demands and data complexity of DNS.